Materials for “Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting”

Authors

  • Yudong Chen
  • Jiaming Xu
Abstract

We provide the proofs for the theorems in the main paper.

1 Proofs for Planted Clustering

In this section, Theorems 1–6 refer to the theorems in the main paper. Equations are numbered continuously from the main paper. We let $n_1 := rK$ and $n_2 := n - rK$ be the numbers of nonisolated and isolated nodes, respectively.

1.1 Proof of Theorem 1

The proof relies on information-theoretic arguments and Fano's inequality [4]. We use $D(\mathrm{Ber}(p) \,\|\, \mathrm{Ber}(q))$ to denote the KL divergence between two Bernoulli distributions with means $p$ and $q$. We first state an upper bound on $D(\mathrm{Ber}(p) \,\|\, \mathrm{Ber}(q))$, which is used later in the proof:

$$
D(\mathrm{Ber}(p) \,\|\, \mathrm{Ber}(q)) = p \log \frac{p}{q} + (1-p) \log \frac{1-p}{1-q} \overset{(a)}{\le} p\,\frac{p-q}{q} + (1-p)\,\frac{q-p}{1-q} = \frac{(p-q)^2}{q(1-q)}, \tag{16}
$$

where (a) follows from the inequality $\log x \le x - 1$, $\forall x \ge 0$.

Let $\mathbb{P}_{(Y^*,A)}$ be the joint distribution of $Y^*$ and $A$ when $Y^*$ is sampled from $\mathcal{Y}$ uniformly at random and $A$ is generated according to the planted clustering model. Because the supremum is lower bounded by the average, we have

$$
\inf_{\hat{Y}} \sup_{Y^* \in \mathcal{Y}} \mathbb{P}\big[\hat{Y} \ne Y^*\big] \ge \inf_{\hat{Y}} \mathbb{P}_{(Y^*,A)}\big[\hat{Y} \ne Y^*\big]. \tag{17}
$$

Let $H(X)$ be the entropy of a random variable $X$ and $I(X;Z)$ the mutual information between $X$ and $Z$. By Fano's inequality, we have for any $\hat{Y}$,

$$
\mathbb{P}_{(Y^*,A)}\big(\hat{Y} \ne Y^*\big) \ge 1 - \frac{I(Y^*;A) + 1}{\log |\mathcal{Y}|}. \tag{18}
$$

Simple counting gives that $|\mathcal{Y}| = \binom{n}{n_1} \frac{n_1!}{r!\,(K!)^r}$. Note that $\binom{n}{n_1} \ge \left(\frac{n}{n_1}\right)^{n_1}$ and $\sqrt{n}\left(\frac{n}{e}\right)^n \le n! \le e\sqrt{n}\left(\frac{n}{e}\right)^n$. It follows that

$$
|\mathcal{Y}| \ge \frac{(n/n_1)^{n_1} \sqrt{n_1}\,(n_1/e)^{n_1}}{e\sqrt{r}\,(r/e)^r\, e^r K^{r/2} (K/e)^{n_1}} \ge \left(\frac{n}{K}\right)^{n_1} \frac{1}{e\,(r\sqrt{K})^r}.
$$
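The two elementary bounds in this excerpt, inequality (16) and the lower bound on $|\mathcal{Y}|$, can be checked numerically. The following is a minimal sketch, not part of the paper: the function names are hypothetical, logarithms are natural, and the counting formula is evaluated in log space with `math.lgamma` to avoid overflow.

```python
import math

def kl_bernoulli(p, q):
    """D(Ber(p) || Ber(q)) in nats, for 0 <= p <= 1 and 0 < q < 1."""
    total = 0.0
    if p > 0:
        total += p * math.log(p / q)
    if p < 1:
        total += (1 - p) * math.log((1 - p) / (1 - q))
    return total

def chi_square_bound(p, q):
    """Right-hand side of (16): (p - q)^2 / (q (1 - q))."""
    return (p - q) ** 2 / (q * (1 - q))

def log_num_clusterings(n, r, K):
    """log |Y| for |Y| = C(n, n1) * n1! / (r! * (K!)^r), with n1 = r*K."""
    n1 = r * K
    # C(n, n1) * n1! simplifies to n! / (n - n1)!
    return (math.lgamma(n + 1) - math.lgamma(n - n1 + 1)
            - math.lgamma(r + 1) - r * math.lgamma(K + 1))

def log_lower_bound(n, r, K):
    """log of the final lower bound (n/K)^{n1} / (e * (r*sqrt(K))^r)."""
    n1 = r * K
    return n1 * math.log(n / K) - 1 - r * math.log(r * math.sqrt(K))

if __name__ == "__main__":
    for p, q in [(0.9, 0.1), (0.3, 0.25), (0.5, 0.5)]:
        print(p, q, kl_bernoulli(p, q), chi_square_bound(p, q))
    for n, r, K in [(20, 2, 3), (100, 5, 10)]:
        print(n, r, K, log_num_clusterings(n, r, K), log_lower_bound(n, r, K))
```

In each printed row the first computed quantity should not exceed the second for the KL check, and the exact log-count should be at least the log of the lower bound for the counting check.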

Similar resources

Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting

The planted models assume that a graph is generated from some unknown clusters by randomly placing edges between nodes according to their cluster memberships; the task is to recover the clusters given the graph. Special cases include planted clique, planted partition, planted densest subgraph and planted coloring. Of particular interest is the high-dimensional setting where the number of cluste...

Statistical-Computational Tradeoffs in Planted Models: The High-Dimensional Setting

The planted models assume that a graph is generated from a set of clusters by randomly placing edges between nodes according to their cluster memberships; the task is to recover the clusters given the graph. Special cases include planted clique, planted partition and planted coloring. This paper studies the statistical-computational tradeoffs of these models. Our focus is the high-dimensional se...

Sharp Computational-Statistical Phase Transitions via Oracle Computational Model

We study the fundamental tradeoffs between computational tractability and statistical accuracy for a general family of hypothesis testing problems with combinatorial structures. Based upon an oracle model of computation, which captures the interactions between algorithms and data, we establish a general lower bound that explicitly connects the minimum testing risk under computational budget con...

Simulation of Melting in Two-Dimensional Systems

The study of two-dimensional (2-D) systems started nearly half a century ago, when Peierls and Landau showed the lack of long-range translational order in a two-dimensional solid. In 1968, Mermin proved that, despite the absence of long-range translational order, two-dimensional solids can still exhibit a different kind of long-range order, namely bond-orientational order. During the last decade, fascinating theori...

Experimental study and numerical simulation of three dimensional two phase impinging jet flow using anisotropic turbulence model

The hydrodynamics of a turbulent impinging jet on a flat plate have been studied experimentally and numerically. Experiments were conducted for Reynolds numbers ranging from 72000 to 102000 and a fixed dimensionless jet-to-plate distance of H/d = 3.5. Based on the experimental setup, a multi-phase numerical model was simulated to predict the flow properties of impinging jets using two turbulence models. Mesh...


Publication date: 2014